---
title: "Guide: Building a Research Paper Assistant with RAG | Mastra RAG Guides"
description: Guide on creating an AI research assistant that can analyze and answer questions about academic papers using RAG.
---

import Steps from "@site/src/components/Steps";
import StepItem from "@site/src/components/StepItem";

# Building a Research Paper Assistant with RAG

In this guide, you'll create an AI research assistant that can analyze academic papers and answer specific questions about their content using Retrieval Augmented Generation (RAG).

You'll use the foundational Transformer paper ["Attention Is All You Need"](https://arxiv.org/html/1706.03762) as your example, and a local LibSQL database as the vector store.

## Prerequisites

- Node.js `v22.13.0` or later installed
- An API key from a supported [Model Provider](/models/v1)
- An existing Mastra project (Follow the [installation guide](/guides/v1/getting-started/quickstart) to set up a new project)

## How RAG works

Let's understand how RAG works and how you'll implement each component.

### Knowledge Store/Index

- Converting text into numerical vector representations (embeddings)
- Storing those embeddings so they can be searched later
- **Implementation**: You'll use OpenAI's `text-embedding-3-small` to create embeddings and store them in LibSQLVector

### Retriever

- Finding relevant content via similarity search
- Matching query embeddings with stored vectors
- **Implementation**: You'll use LibSQLVector to perform similarity searches on the stored embeddings
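To make "matching query embeddings with stored vectors" concrete, here is a toy in-memory retriever. It is a conceptual sketch only, not how LibSQLVector is implemented: the `StoredVector` type, the `topK` helper, and the three-dimensional vectors are all made up for illustration, and cosine similarity is assumed as the metric.

```ts
// Toy in-memory retriever (illustrative only, not LibSQLVector internals).
type StoredVector = { id: string; vector: number[]; text: string };

// Cosine similarity: dot(a, b) / (|a| * |b|). 1 means "same direction".
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored vector against the query and keep the k best matches.
function topK(query: number[], store: StoredVector[], k: number) {
  return store
    .map((item) => ({ ...item, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Hypothetical 3-dimensional "embeddings"; real ones have many more dimensions.
const store: StoredVector[] = [
  { id: "a", vector: [1, 0, 0], text: "self-attention" },
  { id: "b", vector: [0, 1, 0], text: "training schedule" },
  { id: "c", vector: [0.9, 0.1, 0], text: "multi-head attention" },
];

const results = topK([1, 0, 0], store, 2);
console.log(results.map((r) => r.id));
```

A real vector store typically uses an index instead of scanning every vector, but the ranking idea is the same.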

### Generator

- Processing retrieved content with an LLM
- Creating contextually informed responses
- **Implementation**: You'll use an OpenAI model (`openai/gpt-5.1` in the agent code in this guide) to generate answers based on retrieved content
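Conceptually, the generator step means placing the retrieved chunks in front of the model together with the question. In this guide the agent does that for you through its tool call, so you won't write this by hand. The sketch below only illustrates the idea: `RetrievedChunk` and `buildContextPrompt` are hypothetical names, not part of Mastra.

```ts
// Hypothetical shape of a retrieved chunk (illustration only).
type RetrievedChunk = { text: string; source: string; score: number };

// Combine retrieved chunks and the user question into a single prompt string.
function buildContextPrompt(question: string, chunks: RetrievedChunk[]): string {
  const context = chunks
    .map((chunk, i) => `[${i + 1}] (${chunk.source}) ${chunk.text}`)
    .join("\n");
  return [
    "Answer using only the context below.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}

const prompt = buildContextPrompt("What is self-attention?", [
  { text: "Self-attention relates positions of a single sequence.", source: "transformer-paper", score: 0.92 },
  { text: "Multi-head attention runs several attention layers in parallel.", source: "transformer-paper", score: 0.87 },
]);
console.log(prompt);
```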

Your implementation will:

1. Process the Transformer paper into embeddings
2. Store them in LibSQLVector for quick retrieval
3. Use similarity search to find relevant sections
4. Generate accurate responses using retrieved context

## Creating the Agent

Let's define the agent's behavior, connect it to your Mastra project, and create the vector store.

<Steps>

<StepItem title="Install additional dependencies">

After running the [installation guide](/guides/v1/getting-started/quickstart) you'll need to install additional dependencies:

```bash copy
npm install @mastra/rag@beta ai@^4.0.0
```
</StepItem>

<StepItem title="Define the Agent">

Now you'll create your RAG-enabled research assistant. The agent uses:

- A [Vector Query Tool](/reference/v1/tools/vector-query-tool) for performing semantic search over the vector store to find relevant content in papers
- An OpenAI model (`openai/gpt-5.1` in the code below) for understanding queries and generating responses
- Custom instructions that guide the agent on how to analyze papers, use retrieved content effectively, and acknowledge limitations

Create a new file `src/mastra/agents/researchAgent.ts` and define your agent:

```ts copy title="src/mastra/agents/researchAgent.ts"
import { Agent } from "@mastra/core/agent";
import { ModelRouterEmbeddingModel } from "@mastra/core/llm";
import { createVectorQueryTool } from "@mastra/rag";

// Create a tool for semantic search over the paper embeddings
const vectorQueryTool = createVectorQueryTool({
  vectorStoreName: "libSqlVector",
  indexName: "papers",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
});

export const researchAgent = new Agent({
  id: "research-agent",
  name: "Research Assistant",
  instructions: `You are a helpful research assistant that analyzes academic papers and technical documents.
    Use the provided vector query tool to find relevant information from your knowledge base,
    and provide accurate, well-supported answers based on the retrieved content.
    Focus on the specific content available in the tool and acknowledge if you cannot find sufficient information to answer a question.
    Base your responses only on the content provided, not on general knowledge.`,
  model: "openai/gpt-5.1",
  tools: {
    vectorQueryTool,
  },
});
```

</StepItem>

<StepItem title="Create the Vector Store">

In the root of your project, grab the absolute path with the `pwd` command. The path might be similar to this:

```bash
> pwd
/Users/your-name/guides/research-assistant
```

In your `src/mastra/index.ts` file, add the following to your existing file and configuration:

```ts copy title="src/mastra/index.ts" {2, 4-7, 10}
import { Mastra } from "@mastra/core";
import { LibSQLVector } from "@mastra/libsql";

const libSqlVector = new LibSQLVector({
  id: 'research-vectors',
  connectionUrl: "file:/Users/your-name/guides/research-assistant/vector.db",
});

export const mastra = new Mastra({
  vectors: { libSqlVector },
});
```

For the `connectionUrl` use the absolute path you got from the `pwd` command. This way the `vector.db` file is created at the root of your project.
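If you'd rather not paste the path by hand, one option is to derive the `file:` URL from the current working directory. This is a sketch under an assumption: the process is started from the project root, so `process.cwd()` matches what `pwd` printed. The working directory can differ depending on how the process is launched, so check the resulting path before relying on it. `localDbUrl` is a hypothetical helper, not a Mastra API.

```ts
import * as path from "node:path";

// Build a `file:` URL for a database file relative to a base directory.
// Assumption: baseDir defaults to the directory the process was started in.
function localDbUrl(fileName: string, baseDir: string = process.cwd()): string {
  return `file:${path.resolve(baseDir, fileName)}`;
}

const connectionUrl = localDbUrl("vector.db");
console.log(connectionUrl);
```

You could then pass `connectionUrl` to `LibSQLVector` instead of the hardcoded string.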

:::note

For the purpose of this guide you are using a hardcoded absolute path to your
local LibSQL file. This won't work for production usage, where you should use
a remote persistent database instead.

:::

</StepItem>

<StepItem title="Register the Agent with Mastra">

In the `src/mastra/index.ts` file, add the agent to Mastra:

```ts copy title="src/mastra/index.ts" {3, 11}
import { Mastra } from "@mastra/core";
import { LibSQLVector } from "@mastra/libsql";
import { researchAgent } from "./agents/researchAgent";

const libSqlVector = new LibSQLVector({
  id: 'research-vectors',
  connectionUrl: "file:/Users/your-name/guides/research-assistant/vector.db",
});

export const mastra = new Mastra({
  agents: { researchAgent },
  vectors: { libSqlVector },
});
```

</StepItem>

</Steps>

## Processing documents

In the following steps you'll fetch the research paper, split it into smaller chunks, generate embeddings for them, and store the chunks and their embeddings in the vector database.

<Steps>

<StepItem title="Load and Process the Paper">

In this step the research paper is fetched from a URL, converted to a document object, and split into smaller, manageable chunks. Smaller chunks are faster to embed, and they let the retriever return only the passages relevant to a question instead of the whole paper.

Create a new file `src/store.ts` and add the following:

```ts copy title="src/store.ts"
import { MDocument } from "@mastra/rag";

// Load the paper
const paperUrl = "https://arxiv.org/html/1706.03762";
const response = await fetch(paperUrl);
const paperText = await response.text();

// Create document and chunk it
const doc = MDocument.fromText(paperText);
const chunks = await doc.chunk({
  strategy: "recursive",
  maxSize: 512,
  overlap: 50,
  separators: ["\n\n", "\n", " "],
});

console.log("Number of chunks:", chunks.length);
```

Run the file in your terminal:

```bash copy
npx bun src/store.ts
```

You should see output similar to this (the exact number may vary if the page content changes):

```bash
Number of chunks: 892
```
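To build intuition for the `maxSize` and `overlap` options, here is a simplified fixed-window chunker. It is not the `recursive` strategy used above, which also splits on the configured separators. It assumes sizes are counted in characters, and `chunkWithOverlap` is a hypothetical helper written for this illustration.

```ts
// Simplified sliding-window chunker (illustration only, character-based).
function chunkWithOverlap(text: string, maxSize: number, overlap: number): string[] {
  if (maxSize <= 0) throw new Error("maxSize must be positive");
  if (overlap < 0 || overlap >= maxSize) {
    throw new Error("overlap must be non-negative and smaller than maxSize");
  }
  const chunks: string[] = [];
  const step = maxSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + maxSize));
    if (start + maxSize >= text.length) break; // last window reached the end
  }
  return chunks;
}

const pieces = chunkWithOverlap("abcdefghijklmnopqrstuvwxyz", 10, 3);
console.log(pieces);
// → [ 'abcdefghij', 'hijklmnopq', 'opqrstuvwx', 'vwxyz' ]
```

Each chunk repeats the last three characters of the previous one, so a sentence cut at a chunk boundary still appears whole in at least one chunk when the overlap is large enough.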

</StepItem>

<StepItem title="Create and Store Embeddings">

Finally, you'll prepare the content for RAG by:

1. Generating embeddings for each chunk of text
2. Creating a vector store index to hold the embeddings
3. Storing both the embeddings and metadata (original text and source information) in the vector database

:::note

This metadata is crucial as it allows for returning the actual content when
the vector store finds relevant matches.

:::

This allows the agent to efficiently search and retrieve relevant information.

Open the `src/store.ts` file and add the following:

```ts copy title="src/store.ts"
import { ModelRouterEmbeddingModel } from "@mastra/core/llm";
import { MDocument } from "@mastra/rag";
import { embedMany } from "ai";
import { mastra } from "./mastra";

// Load the paper
const paperUrl = "https://arxiv.org/html/1706.03762";
const response = await fetch(paperUrl);
const paperText = await response.text();

// Create document and chunk it
const doc = MDocument.fromText(paperText);
const chunks = await doc.chunk({
  strategy: "recursive",
  maxSize: 512,
  overlap: 50,
  separators: ["\n\n", "\n", " "],
});

// Generate embeddings
const { embeddings } = await embedMany({
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
  values: chunks.map((chunk) => chunk.text),
});

// Get the vector store instance from Mastra
const vectorStore = mastra.getVector("libSqlVector");

// Create an index for paper chunks
await vectorStore.createIndex({
  indexName: "papers",
  dimension: 1536,
});

// Store embeddings
await vectorStore.upsert({
  indexName: "papers",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({
    text: chunk.text,
    source: "transformer-paper",
  })),
});
```

Lastly, run the script again to generate and store the embeddings:

```bash copy
npx bun src/store.ts
```

If the operation succeeds, the script exits without printing any output or errors.
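Because a successful run is silent, a small sanity check before upserting can catch mismatches early. The sketch below verifies that every embedding has the index dimension (1536 for `text-embedding-3-small` at its default size) and that there is one metadata entry per vector. `assertEmbeddingsMatch` is a hypothetical helper, and the demo uses tiny three-dimensional vectors rather than real embeddings.

```ts
type ChunkMetadata = { text: string; source: string };

// Throw if vectors and metadata are misaligned or a vector has the wrong length.
function assertEmbeddingsMatch(
  embeddings: number[][],
  metadata: ChunkMetadata[],
  dimension: number,
): void {
  if (embeddings.length !== metadata.length) {
    throw new Error(
      `Got ${embeddings.length} embeddings but ${metadata.length} metadata entries`,
    );
  }
  embeddings.forEach((vector, i) => {
    if (vector.length !== dimension) {
      throw new Error(
        `Embedding ${i} has length ${vector.length}, expected ${dimension}`,
      );
    }
  });
}

// Demo with made-up 3-dimensional vectors; use 1536 with the real embeddings.
const demoEmbeddings = [
  [0.1, 0.2, 0.3],
  [0.4, 0.5, 0.6],
];
const demoMetadata: ChunkMetadata[] = [
  { text: "first chunk", source: "transformer-paper" },
  { text: "second chunk", source: "transformer-paper" },
];

assertEmbeddingsMatch(demoEmbeddings, demoMetadata, 3);
console.log("Embeddings and metadata are aligned");
```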

</StepItem>

</Steps>

## Test the Assistant

Now that the vector database has all embeddings, you can test the research assistant with different types of queries.

Create a new file `src/ask-agent.ts` and add different types of queries:

```ts title="src/ask-agent.ts" copy
import { mastra } from "./mastra";
const agent = mastra.getAgent("researchAgent");

// Basic query about concepts
const query1 =
  "What problems does sequence modeling face with neural networks?";
const response1 = await agent.generate(query1);
console.log("\nQuery:", query1);
console.log("Response:", response1.text);
```

Run the script:

```bash copy
npx bun src/ask-agent.ts
```

You should see output like:

```bash
Query: What problems does sequence modeling face with neural networks?
Response: Sequence modeling with neural networks faces several key challenges:
1. Vanishing and exploding gradients during training, especially with long sequences
2. Difficulty handling long-term dependencies in the input
3. Limited computational efficiency due to sequential processing
4. Challenges in parallelizing computations, resulting in longer training times
```

Try another question:

```ts title="src/ask-agent.ts" copy
import { mastra } from "./mastra";
const agent = mastra.getAgent("researchAgent");

// Query about specific findings
const query2 = "What improvements were achieved in translation quality?";
const response2 = await agent.generate(query2);
console.log("\nQuery:", query2);
console.log("Response:", response2.text);
```

Output:

```
Query: What improvements were achieved in translation quality?
Response: The model showed significant improvements in translation quality, achieving more than 2.0
BLEU points improvement over previously reported models on the WMT 2014 English-to-German translation
task, while also reducing training costs.
```
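If you want to try several questions in one run, a small loop avoids editing the file between runs. The sketch below relies only on what the scripts above use: a `generate()` method that resolves to an object with a `text` property. `TextAgent`, `askAll`, and the stub agent are hypothetical and exist so the example runs without an API key. In your project you would likely pass `mastra.getAgent("researchAgent")` instead of the stub.

```ts
// Minimal shape this sketch relies on (an assumption based on the scripts above).
type TextAgent = { generate(prompt: string): Promise<{ text: string }> };

// Ask each question in order and collect the answers.
async function askAll(agent: TextAgent, queries: string[]) {
  const results: { query: string; answer: string }[] = [];
  for (const query of queries) {
    const response = await agent.generate(query);
    results.push({ query, answer: response.text });
  }
  return results;
}

// Hypothetical stub standing in for the real research agent.
const stubAgent: TextAgent = {
  async generate(prompt) {
    return { text: `stub answer for: ${prompt}` };
  },
};

askAll(stubAgent, [
  "What problems does sequence modeling face with neural networks?",
  "What improvements were achieved in translation quality?",
]).then((results) => {
  for (const { query, answer } of results) {
    console.log("\nQuery:", query);
    console.log("Response:", answer);
  }
});
```

Running the queries sequentially keeps the output in order, at the cost of waiting for each answer before asking the next question.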

### Serve the Application

Start the Mastra server to expose your research assistant via API:

```bash
mastra dev
```

Your research assistant will be available at:

```
http://localhost:4111/api/agents/researchAgent/generate
```

Test with curl:

```bash
curl -X POST http://localhost:4111/api/agents/researchAgent/generate \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "What were the main findings about model parallelization?" }
    ]
  }'
```
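The same request can be sent from TypeScript. This sketch builds the URL and request options that mirror the curl call above. `buildGenerateRequest` is a hypothetical helper, and the response shape isn't shown in this guide, so the commented-out lines only log whatever JSON comes back.

```ts
type ChatMessage = { role: "user" | "assistant" | "system"; content: string };

// Build the URL and fetch options for the agent's generate endpoint.
function buildGenerateRequest(question: string, baseUrl = "http://localhost:4111") {
  const messages: ChatMessage[] = [{ role: "user", content: question }];
  return {
    url: `${baseUrl}/api/agents/researchAgent/generate`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ messages }),
    },
  };
}

const { url, init } = buildGenerateRequest(
  "What were the main findings about model parallelization?",
);
console.log(url);

// With `mastra dev` running, send the request using the global fetch:
// const res = await fetch(url, init);
// console.log(await res.json());
```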

## Advanced RAG Examples

Explore these examples for more advanced RAG techniques:

- [Filter RAG](/examples/v1/rag/usage/filter-rag) for filtering results using metadata
- [Cleanup RAG](/examples/v1/rag/usage/cleanup-rag) for optimizing information density
- [Chain of Thought RAG](/examples/v1/rag/usage/cot-rag) for complex reasoning queries using workflows
- [Rerank RAG](/examples/v1/rag/usage/rerank-rag) for improved result relevance
